HTML Basics: Forms

Introduction

Forms were introduced in HTML 2.0 as a way for users to enter data in a document, which could then be parsed and responded to on the server. Forms let you search databases, send e-mail messages, fill in questionnaires and more.

There are only a few HTML elements which apply to forms, but they allow you to get almost every kind of data from your visitors. Everything from buttons to checkboxes and input areas is available.

The data which the form generates when it is submitted or sent to the server is encoded in a special way. This is called form encoding, and it is discussed below.

	Introduction to constructing forms
	Input fields: checkboxes, radio buttons...
	Selection lists
	Text input areas

To illustrate how the various form elements work, you can look at the example form. The idea is to construct a form in which people can comment on this HTML Basics series. To make it easier to fill in and to process, most of the information will be made available with selection lists, although there will also be room for personalized comments.

Introduction to constructing a form

The HTML FORM element is used to set up forms. It goes around all form elements which apply to this form, and takes two attributes. The first is METHOD, which can be either GET or POST. The second is ACTION, which points to the URL for the processing script on the server. It can be a relative URL, but most often it's a full URL. This script is often called a "CGI script".
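
As a minimal sketch of the above, a form skeleton might look like this. The script URL /cgi-bin/comments and the field name "visitor" are made up for illustration and do not refer to a real script.

```html
<!-- Hypothetical example: the ACTION URL and the field name are placeholders. -->
<FORM METHOD="POST" ACTION="/cgi-bin/comments">
<P>Your name: <INPUT TYPE="text" NAME="visitor">
<P><INPUT TYPE="submit" VALUE="Send">
</FORM>
```

Everything between the FORM tags belongs to this form; the elements used inside it are discussed in the following sections.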

Submitting a form: GET vs POST

The two different methods do not affect the form itself, but only the way in which the data is sent. With GET, the data is appended to the URL for the script, which is then retrieved like a normal document. This means the form input will show up in the "Current URL" window of your browser. It also means that the form input shouldn't be too long, as there are limitations on how long such a URL can be. The main advantage of GET is that you can "hard-code" a form submission in a document, so users can select a link to perform an automatic search on certain keywords, for example.
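
A hard-coded submission of this kind could be written as an ordinary link. This is only a sketch: the script location /cgi-bin/search and the field name "keywords" are assumptions chosen for the example.

```html
<!-- Hypothetical search script and field name. Selecting the link has the
     same effect as submitting the GET form below with "forms" typed into
     the field named "keywords". -->
<A HREF="/cgi-bin/search?keywords=forms">Search for pages about forms</A>

<!-- The equivalent form, for comparison. -->
<FORM METHOD="GET" ACTION="/cgi-bin/search">
<P>Keywords: <INPUT TYPE="text" NAME="keywords">
<INPUT TYPE="submit" VALUE="Search">
</FORM>
```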

The POST method does not have this length restriction, so much larger amounts of data can be sent to the server. The data is sent in the body of the request and is presented on standard input to the processing script or program. The form input does not show up in the URL. Use POST if you have textareas on your form, or if you otherwise expect a large amount of data to be entered.

Form encoding

Form encoding is needed to get the form's contents to the server and the processing script correctly. The script then needs to decode the form input when it is processing it. This is most commonly done by a "front end" or script which intercepts the input and turns it into something the real program can use.

For all form elements, the input entered or selection made is represented by a name and a value. These two are combined into a pair. A pair is always of the form name=value. All pairs are sent to the server when the form is submitted, in a compact, concatenated form: name1=value&name2=value&name3=value. It is up to the server to split this concatenated list into something usable.

As the data entered by a user can contain reserved characters (such as the "=" or "&" character), the data has to be encoded first. This encoding is simple: each reserved character is replaced by the "%" character followed by its two-digit hexadecimal value in the ISO Latin 1 character set. Spaces are sent as "+". The script can then turn the result back into the actual characters.
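
To make the encoding concrete, here is a small sketch of a form and the data a browser would typically send for it. The script URL and the field names "visitor" and "subject" are invented for the example.

```html
<!-- Hypothetical form: ACTION and field names are placeholders. -->
<FORM METHOD="GET" ACTION="/cgi-bin/comments">
<P>Name: <INPUT TYPE="text" NAME="visitor">
<P>Subject: <INPUT TYPE="text" NAME="subject">
<P><INPUT TYPE="submit" VALUE="Send">
</FORM>

<!-- If the visitor enters  Tom & Jerry  as name and  a=b  as subject,
     the form input becomes:

       visitor=Tom+%26+Jerry&subject=a%3Db

     "&" is hexadecimal 26 and "=" is hexadecimal 3D in ISO Latin 1.
     Spaces are sent as "+". -->
```

The "&" and "=" that separate the pairs stay as they are; only the ones inside the entered values are encoded, which is what lets the script split the list reliably.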

Input fields: INPUT

The INPUT tag is probably the most versatile HTML element. It is used for almost every way to get input from a user. The basic syntax is <INPUT TYPE=x NAME=y>, where "x" denotes one of the types discussed later, and NAME is a unique name for this input field. You can also often use the VALUE attribute to provide an initial value, and SIZE to suggest an appropriate size for the input field.

Of course, NAME, VALUE and SIZE are used for different purposes in every different type, and each type also has its own specific attributes.

TYPE=text

This creates a one-line input box, in which the user can enter some text. The VALUE attribute is used to set a default text in the input box. SIZE indicates how many characters long the input box should be, and MAXLENGTH can be used to indicate an upper limit for the input.

Its main purpose is to allow the user to fill in one-line entries, such as name, e-mail address or phone number. Make sure the field is long enough for its purpose. If the user enters more data than your SIZE attribute indicates, the text will usually scroll inside the box, which makes it impossible to review the complete entered text at once.
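
A one-line text field might be set up as follows. The field name, default text and script URL are placeholders picked for this sketch.

```html
<!-- Hypothetical field name and ACTION. The box is 30 characters wide
     and accepts at most 60 characters. -->
<FORM METHOD="POST" ACTION="/cgi-bin/comments">
<P>E-mail address:
<INPUT TYPE="text" NAME="email" VALUE="you@example.com" SIZE="30" MAXLENGTH="60">
</FORM>
```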

TYPE=password

This works like TYPE=text, but the text isn't shown on the screen. Instead, a "*" character is displayed for each character entered. This offers only very basic protection: it keeps the password from being read off the screen, but the password is still sent to the server in the clear.

In all other respects, it is identical to TYPE=text.
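
A password field could look like this; the script URL /cgi-bin/login and the field name "secret" are assumptions made for the example.

```html
<!-- Hypothetical login script and field name. POST keeps the password out
     of the URL, but it still travels to the server unencrypted. -->
<FORM METHOD="POST" ACTION="/cgi-bin/login">
<P>Password: <INPUT TYPE="password" NAME="secret" SIZE="12">
<P><INPUT TYPE="submit" VALUE="Log in">
</FORM>
```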

TYPE=checkbox

A checkbox is a two-state toggle. The user can check it ("on" state) or uncheck it ("off" state). If you want to supply text to explain the checkbox, you have to add that in the document itself. You can use the attribute CHECKED to indicate that the checkbox should come up in the "on" state rather than the "off" state.

Checkboxes are most commonly used in yes/no queries. It is best to let the "on" state represent something positive, so you don't get into trouble with double negatives, although this depends on how you design your form. For example, "I want to receive e-mail when this page gets updated" accompanying a checkbox is more clear than "I do not want to receive e-mail when this page is updated" even though both serve the same purpose.

Note that if the checkbox is in the "off" state when the form is submitted, it is completely ignored in the form input. If it is in the "on" state, it is sent with the value given in its VALUE attribute. If no VALUE attribute is present, browsers generally send a value of "on".
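
The e-mail notification checkbox mentioned above might be written like this. The field name "updates", its value and the script URL are made up for the sketch.

```html
<!-- Hypothetical field name, value and ACTION. CHECKED makes the box
     come up in the "on" state. -->
<FORM METHOD="POST" ACTION="/cgi-bin/comments">
<P><INPUT TYPE="checkbox" NAME="updates" VALUE="yes" CHECKED>
I want to receive e-mail when this page gets updated
<P><INPUT TYPE="submit" VALUE="Send">
</FORM>

<!-- Checked: the form input contains updates=yes.
     Unchecked: no "updates" pair is sent at all. -->
```

Because an unchecked box sends nothing, the processing script has to treat a missing "updates" pair as "no".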

TYPE=radio

Radio buttons always come in series. Each button in the series should have the same NAME, but a different VALUE. When the user selects one of the buttons in the series, that one will become checked and the previously checked one will become unchecked. Only one button can be checked at any time.

This allows for multiple-choice input, where all choices are mutually exclusive. Make sure that the choices are mutually exclusive, otherwise your reader might try to check more than one option at a time.

Make sure that you give exactly one of the radio buttons the attribute CHECKED, so it comes up as the default value. Without this, some browsers produce a list of radio buttons without any of them checked, which can produce unwanted side-effects when the form gets processed.

When the form gets submitted, the NAME of the series is sent together with the VALUE of the selected radio button.
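
A series of radio buttons could be sketched as follows. The series name "rating", the values and the script URL are placeholders.

```html
<!-- Hypothetical series: all three buttons share the NAME "rating",
     and exactly one of them carries CHECKED as the default. -->
<FORM METHOD="POST" ACTION="/cgi-bin/comments">
<P>How useful was this page?
<INPUT TYPE="radio" NAME="rating" VALUE="good" CHECKED> Useful
<INPUT TYPE="radio" NAME="rating" VALUE="fair"> Somewhat useful
<INPUT TYPE="radio" NAME="rating" VALUE="poor"> Not useful
<P><INPUT TYPE="submit" VALUE="Send">
</FORM>

<!-- With "Somewhat useful" selected, the form input contains rating=fair. -->
```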

TYPE=hidden

A hidden field simply stores information. It is not shown to the user. Its main purpose is to remember some important information for the processing form. This information is often generated by a previous form (then it is used to store the results from that form), or by server-side includes or a similar page-generating mechanism.

The browser will simply send the NAME and VALUE of the hidden field to the server, with no modifications at all.
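
For instance, a comment form could carry along the name of the page being commented on. The field names, the value "forms" and the script URL are invented for this sketch.

```html
<!-- Hypothetical names and ACTION. The hidden field is not displayed,
     but page=forms is sent along with the visible fields. -->
<FORM METHOD="POST" ACTION="/cgi-bin/comments">
<P><INPUT TYPE="hidden" NAME="page" VALUE="forms">
Comments: <INPUT TYPE="text" NAME="comments" SIZE="40">
<P><INPUT TYPE="submit" VALUE="Send">
</FORM>
```

Keep in mind that "hidden" only means "not displayed": anyone who views the document source can read the value.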

TYPE=submit

The submit button is used to submit the form to the server. All elements in the form are evaluated, the form input is generated and sent off in the manner indicated in the FORM element.

You can have more than one submit button on your form, although they all perform the same task. Make sure you give each submit button a different NAME, so that the browser can also tell you which submit button was pressed (with the NAME and VALUE of the button).

The VALUE of the submit button is used as the text to display on the button. If it is not set, the browser supplies a default label such as "Submit".
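
A form with two submit buttons might look like this. The button names "send" and "preview" and the script URL are assumptions for the example.

```html
<!-- Hypothetical names and ACTION. Both buttons submit the form; the
     NAME and VALUE of the pressed one are added to the form input. -->
<FORM METHOD="POST" ACTION="/cgi-bin/comments">
<P>Comments: <INPUT TYPE="text" NAME="comments" SIZE="40">
<P>
<INPUT TYPE="submit" NAME="send" VALUE="Send comments">
<INPUT TYPE="submit" NAME="preview" VALUE="Preview first">
</FORM>

<!-- Pressing the second button adds preview=Preview+first to the form
     input; the "send" button is then left out. -->
```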

TYPE=image

This is an alternative to submit buttons, which only works well on graphical browsers. The SRC attribute (which must be present in this case) indicates the location of an image, which acts as a submit button. When the image is selected, the same actions as with pressing a submit button are performed.

In addition, the coordinates which the user clicked on are sent to the server. If, for example, the NAME is set to "myimage", then the browser would send "myimage.x=3" and "myimage.y=5" if the user clicked on the location (3,5) in the image.
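
An image button matching this description could be written as follows. The image file map.gif and the script URL are placeholders; the NAME "myimage" is taken from the text above.

```html
<!-- Hypothetical image file and ACTION. ALT gives non-graphical
     browsers something to show in place of the image. -->
<FORM METHOD="GET" ACTION="/cgi-bin/map">
<P><INPUT TYPE="image" NAME="myimage" SRC="map.gif" ALT="Select a region">
</FORM>

<!-- A click at position (3,5) sends myimage.x=3&myimage.y=5. -->
```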

TYPE=reset

The reset button restores the form to its initial state when pressed. All input boxes and textareas are reset to their original state (as specified in the VALUE attribute for an input box, and in the document for a text area), and checkboxes and radio buttons are unchecked unless the CHECKED attribute is present.

The VALUE of the reset button is often used as the text to display on the button. If it is not set, a value of "Reset" or "Clear" is used as default.

The reset button is ignored completely when the form is submitted.
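
The following sketch shows a reset button next to a submit button. All names, default values and the script URL are made up for illustration.

```html
<!-- Hypothetical names and ACTION. -->
<FORM METHOD="POST" ACTION="/cgi-bin/comments">
<P>Name: <INPUT TYPE="text" NAME="visitor" VALUE="Anonymous">
<P><INPUT TYPE="checkbox" NAME="updates" CHECKED> Send me updates
<P>
<INPUT TYPE="submit" VALUE="Send">
<INPUT TYPE="reset" VALUE="Start over">
</FORM>

<!-- "Start over" puts "Anonymous" back in the text box and re-checks
     the checkbox, whatever the visitor changed. Nothing is sent to
     the server. -->
```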

Selection lists: SELECT

The SELECT tag is somewhat similar to a set of radio buttons, but it's more flexible. It produces a list of elements from which the user can choose. You can indicate (with the attribute MULTIPLE) that more than one element can be chosen. The SIZE attribute is used to indicate how many elements should be visible on the screen at once.

Typically, if the SIZE is set to one, the result is a "drop-down" box which presents a list of all the items. If it is set to more than one, the user can scroll through the items and pick the one(s) he wants.

Each element in the list is introduced by an OPTION tag. An option may only contain plain text. Each option should have a unique VALUE attribute. If you don't provide a VALUE attribute for an option, the text which is displayed in the list is used as its value. If you want the option to come up selected by default, use the SELECTED attribute.

When the form is submitted, the NAME of the select tag is combined with the value of the selected option and sent to the server.

Example

<P>
Please select the wine of your choice.
<SELECT NAME=wine>
<OPTION>Bordeaux
<OPTION VALUE=beau>Beaujolais
<OPTION SELECTED>None
</SELECT>

In the case that the "Beaujolais" option is chosen, the pair wine=beau is sent to the server. In the other two cases, the displayed text is used as the value.
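
A variation with MULTIPLE and a larger SIZE could look like this. It is only a sketch: the list name "wines", the extra options and the script URL are assumptions.

```html
<!-- Hypothetical list name and ACTION. SIZE="3" shows three options at
     once; MULTIPLE allows more than one to be selected. -->
<FORM METHOD="POST" ACTION="/cgi-bin/order">
<P>Please select the wines you would like to taste.
<SELECT NAME="wines" MULTIPLE SIZE="3">
<OPTION>Bordeaux
<OPTION>Beaujolais
<OPTION>Chablis
<OPTION>Rioja
</SELECT>
<P><INPUT TYPE="submit" VALUE="Order">
</FORM>

<!-- With Bordeaux and Rioja selected, the form input contains
     wines=Bordeaux&wines=Rioja (one pair per selected option). -->
```

The processing script therefore has to expect the same name to appear more than once in the form input.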

Text input areas: TEXTAREA

A TEXTAREA is basically like an input field, only bigger. You must indicate the number of rows and columns the text area should have (with the ROWS and COLS attributes), as well as the name of the text area (with NAME). The browser will then lay out an area of the indicated size, in which the user can enter any amount of text he wants. The text is then sent as value for the text area when the form is submitted.

Unfortunately, many browsers have problems with line wrapping. This means that text entered by a user will not contain newlines unless the user explicitly entered them himself. This can result in very long lines, so you might want to rewrap the input in your processing script.

Default text for the text area can be specified by putting it inside the <TEXTAREA> and </TEXTAREA> tags. You can't use HTML tags here, but you should still encode characters such as &, < and > to prevent confusion. Line breaks you put in the default text should be retained when the text area is displayed.
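
A text area with default text might be written as follows. The name "comments", the dimensions and the script URL are placeholders chosen for this sketch.

```html
<!-- Hypothetical name, size and ACTION. The default text goes between
     the tags; special characters are written as entities. -->
<FORM METHOD="POST" ACTION="/cgi-bin/comments">
<P>Your comments:
<TEXTAREA NAME="comments" ROWS="5" COLS="40">
Please enter your comments here.
Characters such as &amp;, &lt; and &gt; are written as entities.
</TEXTAREA>
<P><INPUT TYPE="submit" VALUE="Send">
</FORM>
```

POST is used here because the amount of text entered in a text area can easily exceed what fits in a URL.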

See the example form for a "how-to" guide on constructing forms.
